Doctoral Dissertation: Barge-in Robust Spoken Dialogue Interface Using Multichannel Sound Field Control and Array Signal Processing
Abstract
A spoken dialogue system is in demand as a user-friendly human-machine interface that requires no special skills to operate. Speech has two advantages: it is hands-free and eyes-free, i.e., one can speak while doing other tasks. To exploit these advantages, the system should remain usable even when the user stands away from the microphone or speaks over the system's output sound (the response sound). The obstacle to meeting these demands is the degradation of automatic speech recognition (ASR) caused by feedback of the response sound and by interfering noise from sources other than the user's speech. Since current ASR systems are sensitive to noise, a noise reduction method is indispensable. An acoustic echo canceller (AEC) is generally used to eliminate the response sound, and an adaptive beamformer (ABF) to eliminate the interfering noise. In each method, a filter is adapted to eliminate its target noise based on the minimum-mean-squared-error criterion. Thus, when the filters are trained on signals containing sources other than their target noise, their performance degrades severely. To prevent such degradation, the system should detect the times when the observed signals contain sounds other than the target noise, a task known as double-talk detection (DTD). However, accurate DTD is difficult, particularly in a situation where both response sound and interfering
∗Doctoral Dissertation, Department of Information Processing, Graduate School of Information Science, Nara Institute of Science and Technology, NAIST-IS-DD0561031, September 30, 2007.
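The sensitivity of an adaptive echo canceller to double-talk, as described in the abstract, can be illustrated with a minimal sketch. It is not the dissertation's method: it assumes a single-channel normalized LMS (NLMS) filter, a short synthetic echo path, white-noise stand-ins for the response sound and the user's speech, and an oracle adaptation mask in place of a real double-talk detector. All names (`nlms_echo_canceller`, `misalignment`, `simulate`) and parameter values are hypothetical.

```python
import random


def nlms_echo_canceller(reference, observed, adapt_mask,
                        num_taps=8, step=0.5, eps=1e-8):
    """Subtract an adaptively filtered copy of `reference` from `observed`.

    The weights are updated only at samples where `adapt_mask` is True.
    Returns the residual signal and the final filter weights.
    """
    weights = [0.0] * num_taps
    buf = [0.0] * num_taps  # most recent reference samples, newest first
    residual = []
    for x, d, adapt in zip(reference, observed, adapt_mask):
        buf = [x] + buf[:-1]
        echo_estimate = sum(w * b for w, b in zip(weights, buf))
        e = d - echo_estimate
        residual.append(e)
        if adapt:
            # Normalized LMS step toward a smaller squared error.
            gain = step * e / (sum(b * b for b in buf) + eps)
            weights = [w + gain * b for w, b in zip(weights, buf)]
    return residual, weights


def misalignment(weights, echo_path):
    """Squared distance between the learned weights and the true echo path."""
    padded = echo_path + [0.0] * (len(weights) - len(echo_path))
    return sum((w - h) ** 2 for w, h in zip(weights, padded))


def simulate(num_samples=4000, seed=0):
    """Compare always-on adaptation with adaptation frozen during double-talk."""
    rng = random.Random(seed)
    echo_path = [0.6, -0.3, 0.1]  # assumed loudspeaker-to-microphone response
    half = num_samples // 2
    response = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    # The simulated user speaks only in the second half (double-talk).
    user = [0.0 if n < half else rng.gauss(0.0, 1.0)
            for n in range(num_samples)]
    mic = []
    for n in range(num_samples):
        echo = sum(h * response[n - k]
                   for k, h in enumerate(echo_path) if n - k >= 0)
        mic.append(echo + user[n])
    always = [True] * num_samples
    frozen = [n < half for n in range(num_samples)]  # oracle DTD (assumption)
    _, w_always = nlms_echo_canceller(response, mic, always)
    _, w_frozen = nlms_echo_canceller(response, mic, frozen)
    return misalignment(w_always, echo_path), misalignment(w_frozen, echo_path)


if __name__ == "__main__":
    mis_always, mis_frozen = simulate()
    print("misalignment, always adapting:", mis_always)
    print("misalignment, frozen during double-talk:", mis_frozen)
```

In this toy setting the filter that keeps adapting while the simulated user speaks ends further from the true echo path than the filter whose adaptation is frozen, which is the degradation that DTD is meant to prevent. The oracle mask hides the hard part: in practice the detector has to be estimated from the observed signals.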
Similar resources
Interface for Barge-in Free Spoken Dialogue System Combining Adaptive Sound Field Control and Microphone Array
This paper describes a new interface for a barge-in free spoken dialogue system combining an adaptive sound field control and a microphone array. In order to actualize robustness against the change of transfer functions due to the various interferences, the barge-in free spoken dialogue system which uses sound field control and a microphone array has been proposed by one of the authors. However...
BARGE-IN FREE SPOKEN DIALOGUE INTERFACE USING NULLSPACE-BASED SOUND FIELD CONTROL AND BEAMFORMING (ThuAmPO4)
This paper describes a new small-scale interface for a barge-in free spoken dialogue system combining a multichannel sound field control and a microphone array, in which the response sound from the system can be canceled out at the microphone points. The conventional method inhibits the user from moving because the system forces the user to stay in the fixed position where the response sound is...
Interface for barge-in free spoken dialogue system using adaptive sound field control
This paper describes a new interface for a barge-in free spoken dialogue system combining an adaptive sound field control and a microphone array. In order to actualize robustness against the change of transfer functions due to the various interferences, the barge-in free spoken dialogue system which uses sound field control and a microphone array has been proposed by one of the authors. However...
Interface for Barge-in Free Spoken Dialogue System Based on Sound Field Reproduction and Microphone Array
A barge-in free spoken dialogue interface using sound field control and microphone array is proposed. In the conventional spoken dialogue system using an acoustic echo canceller, it is indispensable to estimate a room transfer function, especially when the transfer function is changed by various interferences. However, the estimation is difficult when the user and the system speak simultaneousl...
Continuously Predicting and Processing Barge-in During a Live Spoken Dialogue Task
Barge-in enables the user to provide input during system speech, facilitating a more natural and efficient interaction. Standard methods generally focus on single-stage barge-in detection, applying the dialogue policy irrespective of the barge-in context. Unfortunately, this approach performs poorly when used in challenging environments. We propose and evaluate a barge-in processing method that ...